Initial Properties FFI #1269

Merged
merged 14 commits into from
Nov 15, 2021

Conversation

sffc
Member

@sffc sffc commented Nov 9, 2021

Progress on #1159

I added an FFI for one binary property and one enumerated property.

For the enumerated property, I'm not quite sure what to do about the type signature and return type. Since we return a (wrapper around a) DataPayload<T>, I think I need to make a unique FFI type for every enumerated property we support. (I would like to get away from this after #1262 is done.) And for now I made it return a u32 for the script, but should I put the newtype into FFI and make it return the newtype instead?

Also, is there a way to avoid duplicating the giant list of script constants into the FFI file?
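To make the return-type question concrete, here is a minimal sketch of the two options: returning a raw integer versus returning a newtype. Everything in it is hypothetical and for illustration only: `Script` as a `u16` wrapper, the two constants, and `ToyScriptMap` are stand-ins, not the types in this PR.

```rust
/// Hypothetical newtype for a script value. The `u16` width is an assumption
/// made for this sketch.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Script(pub u16);

impl Script {
    // Two illustrative constants; a real list would be much longer.
    pub const COMMON: Script = Script(0);
    pub const LATIN: Script = Script(25);
}

/// Hypothetical stand-in for the property map: non-overlapping inclusive
/// code point ranges, each tagged with a script.
pub struct ToyScriptMap {
    ranges: Vec<(u32, u32, Script)>,
}

impl ToyScriptMap {
    pub fn new() -> Self {
        ToyScriptMap {
            ranges: vec![
                (0x41, 0x5A, Script::LATIN), // A..=Z
                (0x61, 0x7A, Script::LATIN), // a..=z
            ],
        }
    }

    /// Option 2: typed lookup returning the newtype. Anything outside the
    /// toy ranges falls back to COMMON (a simplification for this sketch).
    pub fn get(&self, code_point: u32) -> Script {
        self.ranges
            .iter()
            .find(|(lo, hi, _)| (*lo..=*hi).contains(&code_point))
            .map(|(_, _, script)| *script)
            .unwrap_or(Script::COMMON)
    }

    /// Option 1: expose a plain integer across the boundary.
    pub fn get_raw(&self, code_point: u32) -> u32 {
        u32::from(self.get(code_point).0)
    }
}
```

With the raw integer, callers compare against numeric constants that have to be listed somewhere on the FFI side. With the newtype, the constants stay attached to the type, but the type itself has to be exposed through the FFI layer.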

@sffc sffc marked this pull request as ready for review November 9, 2021 03:14
@sffc sffc requested a review from a team as a code owner November 9, 2021 03:14
Manishearth
Manishearth previously approved these changes Nov 10, 2021
Member

@Manishearth Manishearth left a comment

This looks quite good!

/// The [`ICU4XUnicodeScriptMapProperty`], if creation was successful.
pub data: Option<Box<ICU4XUnicodeScriptMapProperty>>,
/// Whether creating the [`ICU4XUnicodeScriptMapProperty`] was successful.
pub success: bool,
Member

thought (nb): Tbh I feel like the success is redundant if data is an Option.

I eventually want to use Results here but I'd like to fix some things in Diplomat first.

Member Author

I copied this from one of the other components (I think FixedDecimalFormat). I do absolutely think we need to invest more in our Result story. But I don't want to diverge from what we are doing elsewhere until we do it everywhere.
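For readers following this thread, a small sketch of why the flag carries no extra information. `ToyMap`, `ToyMapResult`, and `from_result` are hypothetical names for this illustration, not the generated types. When the struct is built from a `Result`, `success` is always equal to `data.is_some()`:

```rust
/// Hypothetical stand-in for the opaque type handed across the FFI boundary.
pub struct ToyMap {
    pub id: u32,
}

/// Same shape as the struct under review: an optional payload plus a flag.
pub struct ToyMapResult {
    pub data: Option<Box<ToyMap>>,
    pub success: bool,
}

impl ToyMapResult {
    /// Builds the struct from a `Result`, discarding the error value.
    /// The flag is derived from the payload, so the two can never disagree.
    pub fn from_result<E>(result: Result<ToyMap, E>) -> Self {
        let data = result.ok().map(Box::new);
        let success = data.is_some();
        ToyMapResult { data, success }
    }
}
```

The flag does no harm, and keeping it matches the other components for now. A `Result`-based return type would additionally let the error value itself cross the boundary.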

@@ -0,0 +1,28 @@
#ifndef ICU4XUnicodeScriptMapProperty_H
Contributor

(thought)

This file is auto-generated, right?

Would it be possible to have Diplomat add a "this file was autogenerated by ..." comment at the top of generated files, to make it easier for reviewers / readers of the code?

Member Author

Yes. That's a good idea. rust-diplomat/diplomat#100

@@ -0,0 +1,24 @@
#ifndef ICU4XUnicodeScriptMapPropertyResult_H
Contributor

Is there a reason to have this in a separate file from ICU4XUnicodeScriptMapProperty.h, or is it just an implementation detail of Diplomat?

Member Author

Implementation detail of Diplomat.

Comment on lines 32 to 35
/**
* Gets a set for Unicode property ascii_hex_digit from a [`ICU4XDataProvider`].
* See [the Rust docs](https://unicode-org.github.io/icu4x-docs/doc/icu_properties/sets/fn.get_ascii_hex_digit.html) for more information.
*/
Contributor

Is it intentional for the C++ headers to include comments, but not the C headers? It makes sense that the copies of the C headers in ffi/diplomat/cpp are just an implementation detail of the C++ headers, but the same argument doesn't seem to hold for ffi/diplomat/c.

Member Author

@gregtatum added C++ docs generation in rust-diplomat/diplomat#85. The primary documentation is intended to be the *.rst files, but perhaps we should put the docs into the *.h files as well. rust-diplomat/diplomat#101

}

impl ICU4XUnicodeScriptMapProperty {
/// Gets a set for Unicode property ascii_hex_digit from a [`ICU4XDataProvider`].
Contributor

Nit: This comment and the one for try_get_from_static below seem to incorrectly specify "ascii_hex_digit"?

@sffc
Member Author

sffc commented Nov 13, 2021

So the binary is slightly different (slightly larger) after changing to use map_project: the default debug build is 24230864 B before the change and 24287280 B after. I used BinDiff to identify symbols that were added to the new binary, and here is what it says:

Format Pattern Result
000629D0 core::ops::function::FnOnce::call_once::h5497310d930e709f Normal 1 0 1 0 1
00065740 core::ptr::drop_in_place$LT$yoke..yoke..Yoke$LT$icu_properties..provider..UnicodePropertyMapV1$LT$u16$GT$$C$$LP$$RP$$GT$$GT$::h592144a2c3cf0302 Normal 1 0 1 0 1
00066210 core::ptr::drop_in_place$LT$icu_provider..data_provider..DataPayload$LT$icu_properties..provider..UnicodePropertyMapV1Marker$LT$u16$GT$$GT$$GT$::h2d8628594f47424d Normal 1 0 1 1 1
00066550 core::ptr::drop_in_place$LT$icu_provider..data_provider..DataPayloadInner$LT$icu_properties..provider..UnicodePropertyMapV1Marker$LT$u16$GT$$GT$$GT$::h3cbdaa1a836756a6 Normal 6 7 6 1 1
000665C0 core::ptr::drop_in_place$LT$yoke..yoke..Yoke$LT$icu_properties..provider..UnicodePropertyMapV1$LT$u16$GT$$C$alloc..rc..Rc$LT$$u5b$u8$u5d$$GT$$GT$$GT$::h70ddd14379891dd3 Normal 1 0 1 1 1
000671A0 core::ptr::drop_in_place$LT$yoke..yoke..Yoke$LT$icu_properties..provider..UnicodePropertyMapV1$LT$u16$GT$$C$alloc..rc..Rc$LT$dyn$u20$icu_provider..data_provider..ErasedDestructor$GT$$GT$$GT$::h3b52356d8dbdade5 Normal 1 0 1 0 1
00068910 core::ptr::drop_in_place$LT$icu_properties..provider..UnicodePropertyMapV1$LT$u16$GT$$GT$::hea8b0ae3a58cc57f Normal 1 0 1 3 1
00068960 core::ptr::drop_in_place$LT$icu_codepointtrie..codepointtrie..CodePointTrie$LT$u16$GT$$GT$::hd2843e51a107a054 Normal 1 0 1 1 1
00068C00 core::ptr::drop_in_place$LT$zerovec..ule..error..ULEError$LT$core..convert..Infallible$GT$$GT$::h29c8ca77744976be Normal 1 0 1 0 0
0006ABD8 sub_0006ABD8 Normal 1 0 2 0 1
0006F1F0 core::ptr::read::h2d57cbaee0462741 Normal 1 0 1 0 1
00080975 sub_00080975 Normal 1 0 2 0 1
0009A3BD sub_0009A3BD Normal 1 0 2 0 1
0009A46D sub_0009A46D Normal 1 0 2 0 1
00232110 alloc::vec::Vec$LT$T$C$A$GT$::capacity::h2bc9b35efbab5b5e Normal 1 0 1 0 0
00233C40 icu_provider::data_provider::DataPayload$LT$M$GT$::try_map_project_with_capture::hb2258465fba04ee6 Normal 14 18 20 1 1
00236204 sub_00236204 Normal 1 0 1 0 1
00236230 yoke::yoke::Yoke$LT$Y$C$C$GT$::try_project_with_capture::h1f2a391f17911b3a Normal 4 4 6 0 1
00236852 sub_00236852 Normal 1 0 1 0 1
00237200 yoke::yoke::Yoke$LT$Y$C$C$GT$::try_project_with_capture::hbae63b2e8c313d94 Normal 4 4 6 0 1
00246670 icu_capi::properties_maps::ffi::ICU4XCodePointMapData16::prepare_result_from_script::_$u7b$$u7b$closure$u7d$$u7d$::h27fc1408e1d62ca4 Normal 4 4 5 1 1
00253330 icu_codepointtrie::codepointtrie::CodePointTrie$LT$T$GT$::try_into_converted::h77577245bff7ff45 Normal 4 4 5 0 1
00265A00 _$LT$zerovec..ule..plain..PlainOldULE$LT$2_usize$GT$$u20$as$u20$zerovec..ule..ULE$GT$::validate_byte_slice::hd3483422b2486d88 Normal 4 4 4 0 0
0028E2D0 core::mem::size_of_val::h80226cce0d71b350 Normal 1 0 1 1 0
002906F0 core::mem::forget::h60c8b678b9fb1502 Normal 1 0 1 0 1
002A26F0 _$LT$core..result..Result$LT$T$C$F$GT$$u20$as$u20$core..ops..try_trait..FromResidual$LT$core..result..Result$LT$core..convert..Infallible$C$E$GT$$GT$$GT$::from_residual::h0ba03e9318ed3321 Normal 1 0 1 0 1
002A3710 _$LT$core..result..Result$LT$T$C$F$GT$$u20$as$u20$core..ops..try_trait..FromResidual$LT$core..result..Result$LT$core..convert..Infallible$C$E$GT$$GT$$GT$::from_residual::h7ee84fba62a14071 Normal 1 0 1 0 1
002A3DE0 _$LT$core..result..Result$LT$T$C$F$GT$$u20$as$u20$core..ops..try_trait..FromResidual$LT$core..result..Result$LT$core..convert..Infallible$C$E$GT$$GT$$GT$::from_residual::ha8efd718d363ad3a Normal 1 0 1 0 1
002A42B0 _$LT$core..result..Result$LT$T$C$F$GT$$u20$as$u20$core..ops..try_trait..FromResidual$LT$core..result..Result$LT$core..convert..Infallible$C$E$GT$$GT$$GT$::from_residual::hbd402423af5ec151 Normal 1 0 1 0 1
002A47A0 _$LT$core..result..Result$LT$T$C$F$GT$$u20$as$u20$core..ops..try_trait..FromResidual$LT$core..result..Result$LT$core..convert..Infallible$C$E$GT$$GT$$GT$::from_residual::hf7e3f140809c2250 Normal 1 0 1 0 1
002B5110 core::result::Result$LT$T$C$E$GT$::expect::h77660807f493e861 Normal 3 2 4 0 1
002B7D20 _$LT$core..result..Result$LT$T$C$E$GT$$u20$as$u20$core..ops..try_trait..Try$GT$::branch::h45004bb6828788aa Normal 4 4 5 0 1
002B9280 _$LT$core..result..Result$LT$T$C$E$GT$$u20$as$u20$core..ops..try_trait..Try$GT$::branch::h8f6e26cc7cc28310 Normal 4 4 5 0 1
002B9690 _$LT$core..result..Result$LT$T$C$E$GT$$u20$as$u20$core..ops..try_trait..Try$GT$::branch::h943dc69f9df5a0c1 Normal 4 4 5 0 1
002B9AD0 _$LT$core..result..Result$LT$T$C$E$GT$$u20$as$u20$core..ops..try_trait..Try$GT$::branch::hab32c70c1ca4f48f Normal 4 4 5 0 1
002BB100 _$LT$core..result..Result$LT$T$C$E$GT$$u20$as$u20$core..ops..try_trait..Try$GT$::branch::hdd16853a2bb69523 Normal 4 4 5 0 0
002C8080 _$LT$icu_properties..provider..UnicodePropertyMapV1$LT$T$GT$$u20$as$u20$yoke..yokeable..Yokeable$GT$::transform_owned::he05736696ee1afeb Normal 1 0 1 3 1
002C81C0 _$LT$icu_properties..provider..UnicodePropertyMapV1$LT$T$GT$$u20$as$u20$yoke..yokeable..Yokeable$GT$::make::h7f6db4124f1c7e1a Normal 3 2 4 0 1
00349BE0 zerovec::ule::ULE::as_byte_slice::h6cb1320ae01df24d Normal 1 0 1 0 1

The good news is that all of these functions are small, and an optimized build should inline most of them, mitigating binary size impact. There are no trait objects or extra standard library stuff that is being pulled in (as expected). So I'm not going to block this change on the code size diff.
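For scale, the two debug build sizes quoted above differ by 56,416 B, roughly 0.23% of the original binary. A quick sketch of that arithmetic (`size_delta` is a hypothetical helper, not part of the repository):

```rust
/// Returns the absolute size change in bytes and the relative change as a
/// percentage of the `before` size.
pub fn size_delta(before: u64, after: u64) -> (i64, f64) {
    let delta = after as i64 - before as i64;
    let percent = delta as f64 / before as f64 * 100.0;
    (delta, percent)
}
```

Calling `size_delta(24_230_864, 24_287_280)` gives a delta of 56,416 bytes and about 0.23 percent.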

Manishearth
Manishearth previously approved these changes Nov 15, 2021
Manishearth
Manishearth previously approved these changes Nov 15, 2021
echeran
echeran previously approved these changes Nov 15, 2021
@sffc
Member Author

sffc commented Nov 15, 2021

I intend to merge this as soon as CI comes back green. @iainireland left a non-blocking review last week, and I integrated the suggestions from that review, and you're welcome to add more post-submit feedback.

@sffc sffc merged commit 242fa55 into unicode-org:main Nov 15, 2021
@sffc sffc deleted the prop-ffi branch November 15, 2021 21:07
@sffc sffc linked an issue Nov 18, 2021 that may be closed by this pull request
Successfully merging this pull request may close these issues.

Implement FFI for Unicode Properties
4 participants